-
-
Notifications
You must be signed in to change notification settings - Fork 638
Add automatic retry for V8 crash in CI Node.js setup #2082
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This change introduces a custom composite GitHub action that wraps `actions/setup-node@v4` with automatic retry logic to handle transient V8 bytecode deserialization crashes that occur during `yarn cache dir` execution. The V8 crash manifests as: "Fatal error in , line 0 Check failed: ReadSingleBytecodeData(...) == 1" Key improvements: - New composite action `.github/actions/setup-node-with-retry` that: - Pre-validates yarn cache dir works before running setup-node - Automatically retries up to 3 times on V8 crashes - Fails fast on non-V8 errors without retrying - Provides clear warnings in CI logs when retries occur - Updated all CI workflows to use the new action: - examples.yml - Re-enabled yarn caching (was disabled due to this issue) - integration-tests.yml - Re-enabled yarn caching for Node 22 - lint-js-and-ruby.yml - package-js-tests.yml - playwright.yml - pro-integration-tests.yml - pro-lint.yml - pro-test-package-and-gem.yml This resolves transient CI failures and allows us to re-enable yarn caching across all workflows, improving CI performance. Related issues: - nodejs/node#56010 - actions/setup-node#1028 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
WalkthroughIntroduces a new composite GitHub Action for setting up Node.js with automatic retry logic to handle V8 bytecode deserialization failures. Updates nine workflow files to use this new action instead of the standard Changes
Sequence DiagramsequenceDiagram
autonumber
participant Workflow
participant RetryAction as setup-node-with-retry
participant SetupAttempt as setup_node_attempt
participant OfficialSetup as actions/setup-node@v4
Workflow->>RetryAction: Invoke with inputs
loop Retry Loop (up to MAX_RETRIES)
RetryAction->>SetupAttempt: Execute attempt
SetupAttempt->>SetupAttempt: Assess yarn cache dir
alt V8 Crash Detected
SetupAttempt-->>RetryAction: Error pattern matched
RetryAction->>RetryAction: Sleep & increment counter
else Success or Non-Crash Error
SetupAttempt-->>RetryAction: Exit with status
break Exit Loop
end
end
end
rect rgb(200, 220, 255)
Note over RetryAction,OfficialSetup: On successful retry completion
end
RetryAction->>OfficialSetup: Execute with provided inputs
OfficialSetup-->>Workflow: Node setup complete
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes
Areas requiring extra attention:
Possibly related PRs
Suggested labels
Suggested reviewers
Poem
Pre-merge checks and finishing touches✅ Passed checks (3 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Code Review: Add automatic retry for V8 crash in CI Node.js setupOverall Assessment✅ Excellent solution to a real production issue. The approach is well-thought-out and implements a focused, reliable retry mechanism for a known transient failure. ✅ Strengths
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (1)
.github/workflows/playwright.yml (1)
52-55: Consider explicitly specifyingcache-dependency-pathfor clarity.The setup step omits
cache-dependency-path, relying on the composite action's defaults. While this likely works (most actions default to**/yarn.lock), explicitly passing this input would improve readability and consistency with other workflows.Optional improvement:
- uses: ./.github/actions/setup-node-with-retry with: node-version: '20' cache: 'yarn' + cache-dependency-path: '**/yarn.lock'
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (9)
.github/actions/setup-node-with-retry/action.yml(1 hunks).github/workflows/examples.yml(1 hunks).github/workflows/integration-tests.yml(2 hunks).github/workflows/lint-js-and-ruby.yml(1 hunks).github/workflows/package-js-tests.yml(1 hunks).github/workflows/playwright.yml(1 hunks).github/workflows/pro-integration-tests.yml(3 hunks).github/workflows/pro-lint.yml(1 hunks).github/workflows/pro-test-package-and-gem.yml(2 hunks)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: pro-lint-js-and-ruby
- GitHub Check: examples (3.4, latest)
- GitHub Check: rspec-package-tests (3.4, latest)
- GitHub Check: build
- GitHub Check: build-dummy-app-webpack-test-bundles
- GitHub Check: build-dummy-app-webpack-test-bundles
- GitHub Check: claude-review
🔇 Additional comments (13)
.github/workflows/pro-test-package-and-gem.yml (1)
95-100: LGTM—consistent setup-node-with-retry usage across jobs.Both Node setup steps use the new composite action consistently with appropriate inputs for the pro package structure.
Also applies to: 196-201
.github/workflows/pro-integration-tests.yml (1)
95-100: LGTM—consistent setup-node-with-retry across all jobs.All three Node setup steps are correctly configured with the new retry action and appropriate pro-specific paths.
Also applies to: 191-196, 388-393
.github/workflows/lint-js-and-ruby.yml (1)
95-100: LGTM—proper setup-node-with-retry configuration.The Node setup is correctly configured with explicit cache settings and appropriate dependency path.
.github/workflows/examples.yml (1)
116-123: LGTM—well-documented setup-node-with-retry usage.The comments clearly explain that the retry logic now handles V8 crashes automatically, and the tracking link to the setup-node issue is helpful for future maintainers.
.github/workflows/integration-tests.yml (1)
118-125: LGTM—consistent setup-node-with-retry with matrix support.Both Node setup steps are correctly configured to support matrix-driven node versions (20 and 22) with proper retry and cache handling. The updated comments clearly document the V8 crash handling.
Also applies to: 197-204
.github/workflows/pro-lint.yml (1)
93-98: Composite action is properly implemented. All verification points confirmed.The action correctly:
- Implements retry logic for V8 bytecode deserialization errors with configurable max-retries (default 3)
- Tests
yarn cache dirto detect V8 crashes with 30-second timeout- Provides proper error handling and logging
- Delegates to
actions/setup-node@v4with all passed parameters after retry logic.github/workflows/package-js-tests.yml (1)
97-105: I need to search for this bug using different terms since the specific issue number wasn't found. Let me gather more information:Let me search more specifically for Node.js 22.21.0 release information:
Based on my web search attempts, I was unable to find any confirmation that Node.js issue #56010 relates to a V8 bytecode deserialization bug. In fact, the only Node.js issue #56010 I found references a SQLite dependency update, not a V8 bug. Additionally, I could not locate any public issue describing a specific V8 bytecode deserialization bug affecting Node.js 22.21.0 during yarn cache operations.
However, I did find evidence that Node.js 22 has had some yarn-related issues (e.g., Node 22.5.0 breaks yarn install), which suggests the underlying concern about Node 22 stability with package managers may have merit—just not with the specific issue number cited.
Cannot verify the cited Node.js issue #56010 exists as described. The issue reference appears incorrect, and the specific V8 bytecode deserialization bug cannot be confirmed through public sources.
.github/actions/setup-node-with-retry/action.yml (6)
1-18: Metadata looks good. Inputs are well-documented and align withactions/setup-node@v4API.
20-36: Initialization is correct. Environment variables and counters properly set up.
38-77: Pre-validation logic is sound. The function correctly distinguishes V8 crashes (retryable) from other errors (fail-fast), with proper cleanup.
79-94: Retry loop is correct. Proper counter management, 5-second delays, and fail-fast on exhaustion.
101-106: Action call is appropriate for the design. Per the PR objectives, the pre-validation approach (check yarn cache dir before invoking setup-node) is intentional. This prevents transient yarn cache V8 crashes without overcomplicating the workflow.However, please clarify one concern: if
actions/setup-node@v4itself crashes with a V8 error outside the yarn cache dir operation, there's no retry for that case. Verify that the V8 crashes observed in CI are consistently happening during the pre-setup yarn cache validation, not during the actual node setup step itself.
15-18: Input defaults are appropriate. Themax-retriesdefault of'3'with numeric string is safe and used correctly in bash arithmetic/comparison contexts.
Improvements: - Remove unreachable code (lines 96-99 that checked SUCCESS after loop) - Add explicit timeout detection (exit codes 124, 143) - Add clearer logging for non-yarn cache types - Rename function to test_yarn_cache for clarity - Add comment acknowledging limitation that pre-validation doesn't prevent crash in setup-node itself, but catches it early - Improve error handling to exit immediately on non-retryable errors - Add better documentation of what the script actually does This addresses code review feedback while maintaining the practical benefit of catching V8 crashes early in the setup process. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
Code ReviewThis PR adds a custom composite action to handle transient V8 crashes during Node.js setup. Overall, this is a well-designed solution to a challenging problem. ✅ Strengths1. Excellent Problem Analysis
2. Smart Approach
3. Good Error Handling
4. Well-Structured Code
🔍 Potential Issues & Suggestions1. Shell Script RobustnessIssue: The regex pattern in line 64 might not match if the error message format changes slightly. Suggestion: Consider a more flexible pattern to catch variations of the V8 error message. 2. Temp File Cleanup Edge CaseIssue: If the script is killed mid-execution, temp files might not be cleaned up. Suggestion: Add a trap to ensure cleanup on EXIT. 3. Limited ScopeObservation: The action only validates yarn cache dir but doesn't retry if the actual setup-node action crashes. Impact: As documented, this is a known limitation. Consider adding a comment explaining why we can't wrap/retry setup-node itself. 4. Hardcoded ValuesMinor: The timeout (30s) and sleep duration (5s) are hardcoded. Consider making these configurable inputs with defaults. 5. Non-Yarn Cache TypesQuestion: For npm or pnpm cache types, the validation is skipped. Could these also experience V8 crashes during their cache operations? 6. Testing ConcernsObservation: This code is difficult to test since it requires reproducing the V8 crash. Consider adding a test mode input to simulate failures. 🔒 Security Considerations✅ No security concerns identified
🎯 Performance Considerations✅ Positive impact
Minor concern: In worst case (3 failed attempts), adds ~105 seconds to CI run. However, this is acceptable given the alternative is a completely failed job. 📊 Overall AssessmentRating: ⭐⭐⭐⭐ (4/5) This is a pragmatic and well-executed solution to a difficult problem. The approach of pre-validating before setup-node is clever, and re-enabling yarn caching is a significant win for CI performance. The code is clean, well-documented, and follows best practices. The identified issues are mostly minor improvements rather than blockers. Recommendation: ✅ Approve with minor suggestions The PR is ready to merge as-is. The suggested improvements can be addressed in follow-up PRs if needed. 🤖 Generated with Claude Code |
Code Review - PR #2082Thanks for this comprehensive PR addressing the V8 crash issue! The solution is well-designed and thoroughly implemented. Here's my detailed review: ✅ Strengths1. Excellent Problem Analysis
2. Robust Implementation
3. Comprehensive Coverage
🔍 Observations & Suggestions1. Shell Script Best Practices ✅The script handles edge cases well:
2. Potential Race Condition
|
Code Review - PR #2082: Add automatic retry for V8 crash in CI Node.js setupSummaryThis PR introduces a well-designed solution to handle transient V8 bytecode deserialization crashes in CI by creating a custom composite GitHub action with automatic retry logic. Overall, this is a solid implementation that addresses a real pain point while re-enabling yarn caching for improved CI performance. ✅ Strengths
🔍 Code Quality Observations✅ Good Practices
|
Summary
This PR introduces a custom composite GitHub action that pre-validates yarn cache functionality before running Node.js setup, catching transient V8 bytecode deserialization crashes early in the CI process.
Problem
CI jobs occasionally fail with this V8 crash during the
yarn cache dircommand execution inactions/setup-node@v4:This is a known Node.js/V8 bug that occurs sporadically:
sqlite3ext.hnodejs/node#56010Previous workarounds were to disable yarn caching entirely, which significantly slowed down CI.
Solution
Created a new composite action
.github/actions/setup-node-with-retrythat:yarn cache dirworks before runningsetup-nodeImportant Limitation
The pre-validation approach doesn't prevent the V8 crash from potentially occurring again when
setup-noderuns its ownyarn cache dir. However, in practice, if the validation succeeds, setup-node typically succeeds as well. This approach:A more complete solution would require rewriting this as a JavaScript action that directly wraps the
@actions/toolkitcache operations, but this simpler approach provides substantial practical benefit.Changes
Updated all CI workflows to use the new action:
examples.yml- Re-enabled yarn caching (was disabled due to this issue)integration-tests.yml- Re-enabled yarn caching for Node 22lint-js-and-ruby.ymlpackage-js-tests.ymlplaywright.ymlpro-integration-tests.ymlpro-lint.ymlpro-test-package-and-gem.ymlBenefits
actions/setup-node@v4Test Plan
🤖 Generated with Claude Code